
DepWatch 🛡️

Dependency vulnerability scanner powered by OSV.dev and your choice of LLM.

Upload a requirements.txt or package.json. DepWatch queries every pinned dependency against the OSV.dev vulnerability database, then sends the raw findings to an AI provider — Anthropic Claude or Alibaba Qwen — which explains each CVE in plain English, constructs a realistic exploit scenario, and recommends a specific remediation. Results are ranked by urgency and persisted to Postgres so you can track a project's vulnerability profile over time.


Table of Contents

  1. Features
  2. Screenshots
  3. How It Works
  4. Tech Stack
  5. Project Structure
  6. Supported File Formats
  7. AI Providers
  8. OSV.dev — Data Source
  9. Prerequisites
  10. Quick Start — Docker Compose
  11. Local Development — Without Docker
  12. Environment Variables Reference
  13. API Reference
  14. Running Tests
  15. CI / GitHub Actions
  16. Database & Migrations
  17. Design Decisions
  18. Troubleshooting
  19. Contributing
  20. Licence

Features

| Feature | Detail |
| --- | --- |
| File upload | requirements.txt (Python/pip) and package.json (Node/npm) |
| Batch OSV queries | All dependencies checked in a single HTTP call — no API key required |
| AI enrichment | Plain-English CVE explanation, exploit scenario, and version-specific fix |
| Dual AI provider | Switch between Anthropic Claude and Alibaba Qwen with one env var |
| Urgency ranking | Every CVE classified as Immediate, Soon, or Low Priority |
| Severity scoring | CVSS scores extracted from OSV data; Critical / High / Medium / Low labels |
| Summary dashboard | At-a-glance counts: total deps, vulnerable packages, breakdown by urgency and severity |
| Sortable table | Vulnerabilities sorted by urgency then CVSS score; each row expands to show full detail |
| Scan history | Every scan persisted to Postgres; full reports accessible at any time |
| Graceful degradation | If the AI provider is unreachable, raw OSV data is returned with auto-generated stubs — the scan never silently fails |
| Docker Compose | One command boots Postgres + FastAPI backend + Reflex frontend |
| GitHub Actions CI | pytest + Ruff on every push; runs fully offline (SQLite, no network) |

Screenshots

*Upload screen · Scan report · History*

How It Works

┌──────────────────────────────────────────────────────────────────────────────┐
│  Browser  ·  Reflex UI  (port 3000)                                          │
│  Pure Python → compiles to Vite + React at build time                        │
└─────────────────────────────────┬────────────────────────────────────────────┘
                                  │  HTTP POST multipart/form-data
                                  ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│  FastAPI backend  (port 8000)                                                │
│                                                                              │
│  POST /api/v1/scan/                                                          │
│  ┌─────────────────────────────────────────────────────────────────────┐     │
│  │ 1. Receive UploadFile (≤ 5 MB)                                      │     │
│  │ 2. Detect type from filename (.txt → requirements, .json → npm)     │     │
│  │ 3. Parse → extract pinned (==) dependencies only                    │     │
│  │ 4. POST /v1/querybatch → OSV.dev ──────────────────────────────────────► │
│  │ 5. Flatten CVEs → batch prompt → AI provider ◄──────────────────────── │
│  │    (Anthropic claude-sonnet-4-20250514  OR  Qwen qwen-plus)         │     │
│  │ 6. Persist Scan + Vulnerability rows to Postgres                    │     │
│  │ 7. Return ScanResponse JSON                                         │     │
│  └─────────────────────────────────────────────────────────────────────┘     │
│                                                                              │
│  GET /api/v1/history/          paginated scan list                           │
│  GET /api/v1/history/{id}      full report for a past scan                  │
│  GET /health                   liveness probe                                │
└──────────────────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
                      ┌────────────────────────┐
                      │  PostgreSQL 17         │
                      │  scans                 │
                      │  vulnerabilities       │
                      └────────────────────────┘

Request lifecycle in detail

  1. Parse — app/parsers/requirements.py or app/parsers/package_json.py extracts (name, version, ecosystem) tuples. Only exact-version pins (== / bare semver) are retained; ranges, VCS URLs, and wildcards are skipped and returned to the caller as skipped_lines.

  2. OSV query — app/services/osv_client.py builds a single POST /v1/querybatch payload. OSV returns vulnerability lists in the same order as the query, so order is always preserved. Chunks of 100 are used as a safety margin below OSV's 1 000-item batch limit.

  3. AI enrichment — app/services/ai_enrichment.py flattens all (dep, [OsvVulnerability]) pairs into a JSON array and sends it to the configured provider in a single call (sub-batched at 50 CVEs to respect context limits). The system prompt — defined as a constant in app/services/prompts.py — instructs the model to return a strict JSON array matching the VulnerabilityResult schema. Three fallback levels handle malformed responses: direct parse → embedded-array extraction → stub generation.

  4. Persist — A Scan row and one Vulnerability row per CVE are written inside the same get_db session. The commit is handled by the get_db dependency, not the route handler.

  5. Respond — The route returns a ScanResponse with the full vulnerability list, summary statistics, and scan metadata.
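The three fallback levels in step 3 can be sketched in a few lines. This is a simplified illustration, not the code from app/services/ai_enrichment.py — the function name and the stub shape here are hypothetical:

```python
import json
import re


def parse_enrichment_response(raw: str, expected: int) -> list[dict]:
    """Illustrative three-level fallback for a model response."""
    # Level 1: the model returned a clean JSON array.
    try:
        data = json.loads(raw)
        if isinstance(data, list):
            return data
    except json.JSONDecodeError:
        pass
    # Level 2: the array is embedded in prose or a code fence.
    match = re.search(r"\[.*\]", raw, re.DOTALL)
    if match:
        try:
            data = json.loads(match.group(0))
            if isinstance(data, list):
                return data
        except json.JSONDecodeError:
            pass
    # Level 3: give up on the model output and emit one stub per CVE.
    return [
        {"remediation": "Upgrade to the latest patched version."}
        for _ in range(expected)
    ]
```

Because the stub level always succeeds, a malformed model response degrades the report rather than failing the scan.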


Tech Stack

| Layer | Library | Version | Notes |
| --- | --- | --- | --- |
| Frontend | Reflex | 0.8.27 | Pure Python → Vite + React |
| Backend framework | FastAPI | 0.115.6 | Async, OpenAPI auto-docs |
| ASGI server | Uvicorn | 0.32.1 | With standard extras (watchfiles, httptools) |
| File upload | python-multipart | 0.0.20 | FastAPI UploadFile dependency |
| HTTP client | httpx | 0.28.1 | Async; used for OSV + Reflex→backend calls |
| Data validation | Pydantic v2 | 2.10.4 | Models for OSV responses and API shapes |
| Settings | pydantic-settings | 2.7.0 | .env loading with type coercion |
| ORM | SQLAlchemy | 2.0.36 | Async 2.0 style |
| DB driver | asyncpg | 0.30.0 | Async PostgreSQL driver |
| Migrations | Alembic | 1.14.0 | Async-aware env.py |
| Database | PostgreSQL | 17 (Alpine) | UUID PKs, timezone-aware timestamps |
| AI — Anthropic | anthropic | 0.43.0 | AsyncAnthropic client |
| AI — Qwen | openai | 1.58.3 | OpenAI SDK pointed at Dashscope |
| Testing | pytest | 8.3.4 | + pytest-asyncio 0.24.0, pytest-httpx 0.35.0 |
| Linting | Ruff | 0.9.1 | Lint + format check in CI |
| Containerisation | Docker Compose | v2 | Three services: db, backend, frontend |
| CI | GitHub Actions | — | Ubuntu latest, Python 3.12 |

Project Structure

depwatch/
│
├── .github/
│   └── workflows/
│       └── ci.yml                    # pytest + ruff on push / PR to main
│
├── backend/
│   ├── app/
│   │   ├── __init__.py
│   │   ├── config.py                 # pydantic-settings: all env vars + provider helpers
│   │   ├── database.py               # async engine, AsyncSessionLocal, Base, get_db
│   │   ├── main.py                   # FastAPI app factory: lifespan, CORS, router mounts
│   │   │
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   └── scan.py               # Scan + Vulnerability SQLAlchemy ORM models
│   │   │
│   │   ├── parsers/
│   │   │   ├── __init__.py
│   │   │   ├── requirements.py       # requirements.txt → DependencyItem list
│   │   │   └── package_json.py       # package.json → DependencyItem list
│   │   │
│   │   ├── routers/
│   │   │   ├── __init__.py
│   │   │   ├── scan.py               # POST /api/v1/scan/ — full scan pipeline
│   │   │   └── history.py            # GET /api/v1/history/ + GET /api/v1/history/{id}
│   │   │
│   │   ├── schemas/
│   │   │   ├── __init__.py
│   │   │   ├── osv.py                # Pydantic models mirroring OSV.dev API response
│   │   │   └── scan.py               # API-layer request/response schemas
│   │   │
│   │   └── services/
│   │       ├── __init__.py
│   │       ├── prompts.py            # VULNERABILITY_ENRICHMENT_SYSTEM_PROMPT constant
│   │       ├── osv_client.py         # async httpx OSV.dev batch client
│   │       └── ai_enrichment.py      # BaseEnrichmentService + Anthropic/Qwen subclasses
│   │
│   ├── alembic/
│   │   ├── env.py                    # async-aware Alembic environment
│   │   └── versions/
│   │       └── 0001_initial.py       # scans + vulnerabilities tables + indexes
│   │
│   ├── tests/
│   │   ├── conftest.py               # in-memory SQLite fixtures + httpx ASGI client
│   │   ├── test_parsers.py           # 13 parser unit tests
│   │   ├── test_osv_client.py        # 7 OSV client tests (httpx mocked)
│   │   ├── test_ai_enrichment.py     # 24 enrichment tests (both providers + factory)
│   │   └── test_routes.py            # 12 route integration tests
│   │
│   ├── alembic.ini
│   ├── Dockerfile                    # multi-stage build, non-root runtime user
│   ├── pytest.ini                    # asyncio_mode = auto
│   └── requirements.txt
│
├── frontend/
│   ├── depwatch/
│   │   ├── __init__.py
│   │   └── depwatch.py               # complete Reflex app: AppState + all pages
│   ├── Dockerfile                    # Python 3.12 + Node 20 LTS
│   ├── requirements.txt
│   └── rxconfig.py                   # Reflex port config (frontend 3000, backend 3001)
│
├── .env.example                      # all variables documented with defaults
├── .gitignore
├── docker-compose.yml                # db + backend + frontend with health checks
└── README.md

Supported File Formats

requirements.txt — Python / pip

DepWatch extracts only pinned dependencies (the == operator). This is an intentional constraint: OSV.dev's /querybatch API requires an exact version string. Range specifiers like >=1.0,<2.0 cannot be resolved to a single version without running pip install; scanning them would produce unreliable results.

What is scanned:

flask==3.0.0
requests==2.31.0
uvicorn[standard]==0.32.1        # extras ([...]) are stripped — name becomes "uvicorn"
Django==4.2.0 ; python_version>="3.10"  # environment markers stripped

What is skipped (returned in the API response as skipped_lines for transparency):

flask>=3.0.0                     # range constraint
requests~=2.28                   # compatible release
sqlalchemy                       # unpinned — no version at all
git+https://github.com/org/pkg   # VCS URL
-r other-requirements.txt        # include directive
--index-url https://pypi.org/    # option flag
https://example.com/pkg.whl      # direct URL
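The pin-only rule above can be sketched with a single regular expression. This is a minimal illustration of the behaviour described, not the parser in app/parsers/requirements.py — the names and the exact pattern are assumptions:

```python
import re

# Name, optional extras ([standard]), "==", then a version that stops at
# whitespace, a comment, or an environment marker (";").
PIN_RE = re.compile(r"^([A-Za-z0-9_.\-]+)(?:\[[^\]]*\])?\s*==\s*([^;\s#]+)")


def parse_requirements(text: str) -> tuple[list[tuple[str, str]], list[str]]:
    pinned, skipped = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are ignored entirely
        m = PIN_RE.match(stripped)
        if m:
            # extras and environment markers are dropped from the result
            pinned.append((m.group(1), m.group(2)))
        else:
            skipped.append(stripped)  # ranges, VCS URLs, flags, bare names
    return pinned, skipped
```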

package.json — Node.js / npm

DepWatch reads dependencies, devDependencies, and peerDependencies. Duplicate packages (appearing in more than one group) are de-duplicated, with the first occurrence winning.

What is scanned (exact semver only):

{
  "dependencies": {
    "express": "4.18.2",
    "semver":  "=7.5.4"
  }
}

What is skipped:

{
  "dependencies": {
    "lodash":    "^4.17.21",
    "axios":     "~1.6.0",
    "react":     ">=18.0.0",
    "my-lib":    "file:../my-lib",
    "workspace": "workspace:*",
    "from-git":  "github:user/repo"
  }
}
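The exact-semver filter and first-occurrence de-duplication can be sketched like this. The regex and function name are illustrative, not the actual app/parsers/package_json.py implementation:

```python
import json
import re

# Exact semver only, with an optional leading "=" and optional pre-release tag.
EXACT_RE = re.compile(r"^=?(\d+\.\d+\.\d+(?:-[0-9A-Za-z.\-]+)?)$")


def parse_package_json(text: str) -> dict[str, str]:
    data = json.loads(text)
    deps: dict[str, str] = {}
    for group in ("dependencies", "devDependencies", "peerDependencies"):
        for name, spec in data.get(group, {}).items():
            m = EXACT_RE.match(spec)
            # ranges (^, ~, >=), file:/workspace:/git specs fail the match;
            # a package seen in an earlier group wins over later duplicates
            if m and name not in deps:
                deps[name] = m.group(1)
    return deps
```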

Tip: To maximise scan coverage, pin all your production dependencies. Run pip freeze > requirements.txt or npm shrinkwrap to generate a fully-pinned lockfile suitable for DepWatch.


AI Providers

DepWatch supports two interchangeable AI providers, selected by a single environment variable. Both receive the same system prompt and return the same structured JSON — switching providers requires no code changes.

Anthropic Claude (default)

AI_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...
  • Default model: claude-sonnet-4-20250514
  • Override: ANTHROPIC_MODEL=claude-opus-4-5 (or any available model)
  • SDK: anthropic Python SDK, AsyncAnthropic client
  • Get a key: https://console.anthropic.com

Claude is called via messages.create() with a system parameter containing the structured enrichment prompt. The response is the first content[0].text block.

Alibaba Qwen via Dashscope

AI_PROVIDER=qwen
QWEN_API_KEY=sk-...
  • Default model: qwen-plus — balanced capability and cost for structured JSON
  • Override: QWEN_MODEL=qwen-max (highest capability) or qwen-turbo (fastest / cheapest)
  • SDK: openai Python SDK, AsyncOpenAI client pointed at the Dashscope endpoint
  • Endpoint: https://dashscope.aliyuncs.com/compatible-mode/v1 (OpenAI-compatible)
  • Get a key: https://dashscope.aliyuncs.com

Because Dashscope's API is fully OpenAI-compatible, no Qwen-specific SDK is required. The openai SDK is used with a custom base_url. Qwen calls include response_format={"type": "json_object"} to encourage strict JSON output (supported by qwen-plus and qwen-max; remove this if using qwen-turbo).

Provider comparison

| | Anthropic Claude Sonnet | Qwen Plus | Qwen Max |
| --- | --- | --- | --- |
| JSON reliability | Excellent | Very good | Excellent |
| Context window | 200k tokens | 131k tokens | 32k tokens |
| Speed | Fast | Fast | Moderate |
| Cost | $$ | $ | $$ |
| Best for | Production use | Cost-sensitive / China region | Maximum accuracy |

How provider selection works

AI_PROVIDER=anthropic  →  AnthropicEnrichmentService  (anthropic SDK)
AI_PROVIDER=qwen       →  QwenEnrichmentService        (openai SDK + Dashscope)
(missing key)          →  StubEnrichmentService         (OSV data + auto labels)

The factory function get_enrichment_service() in app/services/ai_enrichment.py returns a cached singleton. All three classes extend BaseEnrichmentService, which owns the JSON-parsing pipeline (direct parse → embedded-array extraction → stubs). Subclasses implement only _call_api(user_message) -> str.
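The selection-plus-caching pattern looks roughly like the following. This is a hypothetical mirror of the logic — the real factory takes no arguments and reads the provider and key from settings, and the class bodies are elided here:

```python
from functools import lru_cache


class StubEnrichmentService: ...
class AnthropicEnrichmentService: ...
class QwenEnrichmentService: ...


@lru_cache(maxsize=None)
def get_enrichment_service(provider: str, api_key: str):
    if not api_key:
        return StubEnrichmentService()  # graceful degradation path
    if provider == "qwen":
        return QwenEnrichmentService()
    return AnthropicEnrichmentService()
```

With lru_cache, repeated calls return the same instance; cache_clear() plays the role that reset_enrichment_service() plays in the test suite.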

Graceful degradation

If the selected provider's API key is absent or the API call fails at runtime, BaseEnrichmentService catches the exception and falls back to stub objects. Stubs are auto-populated from raw OSV data:

  • plain_english_explanation ← OSV summary field
  • severity ← derived from CVSS score
  • urgency ← derived from severity label
  • remediation ← "Upgrade to the latest patched version."

The scan response is returned successfully — the user sees real CVE identifiers and CVSS scores even without AI enrichment.


OSV.dev — Data Source

OSV.dev (Open Source Vulnerabilities) is a free, open vulnerability database maintained by Google. It aggregates advisories from:

| Source | Ecosystems |
| --- | --- |
| GitHub Security Advisories (GHSA) | All GitHub-hosted packages |
| CVE Programme (NVD) | Cross-ecosystem CVEs |
| PyPI Advisory Database | Python / pip |
| npm Advisory Database | JavaScript / npm |
| RustSec | Rust / Cargo |
| Go Vulnerability Database | Go modules |
| OSS-Fuzz | C, C++ and more |
| …and many more | Maven, NuGet, Hex, Pub, etc. |

How DepWatch uses OSV

DepWatch calls POST https://api.osv.dev/v1/querybatch with all pinned dependencies in a single request body:

{
  "queries": [
    { "package": { "name": "flask",    "ecosystem": "PyPI" }, "version": "1.0.0" },
    { "package": { "name": "requests", "ecosystem": "PyPI" }, "version": "2.25.0" }
  ]
}

OSV returns results in the same order as the queries. Each result is a list of OsvVulnerability objects, which may include:

  • id — OSV identifier (e.g. GHSA-xxxx-yyyy-zzzz) or CVE-YYYY-NNNNN
  • aliases — cross-references including CVE IDs when the primary ID is a GHSA
  • severity — CVSS vector string (e.g. CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H)
  • summary — one-line description
  • details — full advisory text

DepWatch extracts a numeric CVSS base score from the vector string using a three-level fallback: direct float parse → trailing number in vector → qualitative label mapping (HIGH → 7.5, CRITICAL → 9.5, etc.).

No API key is required. OSV.dev is fully free and open.


Prerequisites

| Requirement | Minimum version | Notes |
| --- | --- | --- |
| Git | any | |
| Docker Desktop | 4.x | Includes Docker Compose v2 |
| Python | 3.12 | Only needed for local dev without Docker |
| Node.js | 20 LTS | Only needed for local dev without Docker (Reflex build pipeline) |
| Anthropic API key | — | Required if AI_PROVIDER=anthropic. Free tier available. |
| Qwen / Dashscope key | — | Required if AI_PROVIDER=qwen. Free tier available. |

Quick Start — Docker Compose

This is the recommended way to run DepWatch. All three services (Postgres, backend, frontend) start with a single command.

# 1. Clone the repository
git clone https://github.com/yourname/depwatch.git
cd depwatch

# 2. Create your .env file from the template
cp .env.example .env

Open .env and set your AI provider:

# Option A — Anthropic Claude (default)
AI_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-your-key-here

# Option B — Alibaba Qwen
# AI_PROVIDER=qwen
# QWEN_API_KEY=sk-your-dashscope-key-here
# 3. Build images and start all services
docker compose up --build

# To run in the background:
docker compose up --build -d

What happens on first boot:

| Step | Duration | Detail |
| --- | --- | --- |
| Postgres starts | ~3 s | Health check waits for pg_isready |
| Backend starts | ~5 s | FastAPI runs Base.metadata.create_all() on boot |
| Reflex compiles | ~60 s | Vite bundle is compiled once; cached in the reflex_web Docker volume on subsequent starts |

Once running:

| URL | What it is |
| --- | --- |
| http://localhost:3000 | DepWatch web UI |
| http://localhost:8000/docs | FastAPI interactive API docs (Swagger UI) |
| http://localhost:8000/redoc | FastAPI API docs (ReDoc) |
| http://localhost:8000/health | Liveness probe |

Useful Compose commands:

# View logs for a specific service
docker compose logs -f backend
docker compose logs -f frontend

# Stop all services (preserves data volumes)
docker compose down

# Stop and remove all data (full reset)
docker compose down -v

# Rebuild a single service after a code change
docker compose up --build backend

Local Development — Without Docker

Use this approach when you want faster iteration cycles or need to attach a debugger.

Step 1 — Start Postgres

You still need Postgres running. The simplest way is to start just the DB container:

docker compose up db -d
# Postgres is now available on localhost:5432
# User: depwatch  Password: depwatch  Database: depwatch

Or use an existing local Postgres instance and update DATABASE_URL in .env accordingly.

Step 2 — Backend

cd backend

# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate        # Windows: .venv\Scripts\activate

# Install all dependencies (includes both anthropic and openai SDKs)
pip install -r requirements.txt

# Install the async SQLite driver used only in tests
pip install aiosqlite

# Copy and configure environment
cp ../.env.example .env
# Edit .env: set AI_PROVIDER and the corresponding API key

# Run database migrations
alembic upgrade head

# Start the development server with hot reload
uvicorn app.main:app --reload --port 8000

The API is now running at http://localhost:8000. Interactive docs at http://localhost:8000/docs.

Step 3 — Frontend

Open a second terminal:

cd frontend

python -m venv .venv
source .venv/bin/activate

pip install -r requirements.txt

# First-time initialisation (creates the .web/ directory and installs npm deps)
reflex init

# Start the development server with hot reload
reflex run

The UI is now running at http://localhost:3000. Reflex hot-reloads on Python file saves.

Step 4 — Verify the stack

# Check the backend health endpoint
curl http://localhost:8000/health
# → {"status": "ok", "version": "1.0.0"}

# Test a scan via curl (replace with your own requirements.txt)
curl -X POST http://localhost:8000/api/v1/scan/ \
  -F "file=@/path/to/requirements.txt"

Environment Variables Reference

All variables can be set in the .env file at the project root. Docker Compose reads them automatically. For local development without Docker, place .env in the backend/ directory.

AI Provider

| Variable | Default | Required | Description |
| --- | --- | --- | --- |
| AI_PROVIDER | anthropic | No | Active AI provider. Options: anthropic, qwen |

Anthropic

| Variable | Default | Required | Description |
| --- | --- | --- | --- |
| ANTHROPIC_API_KEY | "" | If AI_PROVIDER=anthropic | Your Anthropic API key. Get one at https://console.anthropic.com |
| ANTHROPIC_MODEL | claude-sonnet-4-20250514 | No | Claude model to use for enrichment |

Qwen

| Variable | Default | Required | Description |
| --- | --- | --- | --- |
| QWEN_API_KEY | "" | If AI_PROVIDER=qwen | Your Dashscope API key. Get one at https://dashscope.aliyuncs.com |
| QWEN_MODEL | qwen-plus | No | Qwen model. Options: qwen-turbo, qwen-plus, qwen-max, qwen-long |
| QWEN_BASE_URL | https://dashscope.aliyuncs.com/compatible-mode/v1 | No | Dashscope OpenAI-compatible endpoint URL |

Database

| Variable | Default | Required | Description |
| --- | --- | --- | --- |
| DATABASE_URL | postgresql+asyncpg://depwatch:depwatch@localhost:5432/depwatch | Yes | SQLAlchemy async DSN. Use postgresql+asyncpg:// scheme |

Note: docker-compose.yml overrides DATABASE_URL with the internal Docker network hostname db. You only need to set this manually for local dev without Docker.

Application

| Variable | Default | Required | Description |
| --- | --- | --- | --- |
| ENVIRONMENT | development | No | Set to production to suppress SQL echo logging |
| LOG_LEVEL | INFO | No | Python logging level: DEBUG, INFO, WARNING, ERROR |

Full .env example

# .env — copy from .env.example and fill in your values

AI_PROVIDER=anthropic

ANTHROPIC_API_KEY=sk-ant-api03-...
ANTHROPIC_MODEL=claude-sonnet-4-20250514

# QWEN_API_KEY=sk-...
# QWEN_MODEL=qwen-plus
# QWEN_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1

DATABASE_URL=postgresql+asyncpg://depwatch:depwatch@localhost:5432/depwatch

ENVIRONMENT=development
LOG_LEVEL=INFO

API Reference

The FastAPI backend exposes an OpenAPI spec at http://localhost:8000/docs (Swagger UI) and http://localhost:8000/redoc. All endpoints are prefixed with /api/v1 except /health.


POST /api/v1/scan/

Upload a dependency file and receive a full vulnerability report.

Request

Content-Type: multipart/form-data
| Field | Type | Description |
| --- | --- | --- |
| file | UploadFile | A requirements.txt or package.json. Maximum 5 MB. |

Response 201 Created

{
  "scan_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "filename": "requirements.txt",
  "file_type": "requirements",        // "requirements" | "package_json"
  "created_at": "2025-03-21T10:30:00Z",
  "summary": {
    "total_dependencies": 24,
    "vulnerable_count": 6,
    "immediate_count": 3,
    "soon_count": 2,
    "low_priority_count": 1,
    "critical_count": 1,
    "high_count": 2,
    "medium_count": 2,
    "low_count": 1
  },
  "vulnerabilities": [
    {
      "id": "7d8e9f10-...",
      "package_name": "flask",
      "package_version": "1.0.0",
      "cve_id": "CVE-2023-30861",
      "cvss_score": 7.5,
      "severity": "High",
      "plain_english_explanation": "Flask before 2.3.2 mishandles the Vary header...",
      "exploit_scenario": "An attacker and victim share a reverse-proxy cache...",
      "remediation": "Upgrade to flask==2.3.3 or later.",
      "urgency": "Immediate"           // "Immediate" | "Soon" | "Low Priority"
    }
    // ...one object per CVE
  ]
}

Error responses

| Status | Condition |
| --- | --- |
| 413 Request Entity Too Large | File exceeds 5 MB |
| 422 Unprocessable Entity | Unsupported file type, invalid JSON, or no pinned dependencies found |
| 502 Bad Gateway | OSV.dev API returned an error or was unreachable |

GET /api/v1/history/

Return a paginated list of past scans, newest first.

Query parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| limit | integer (1–100) | 20 | Number of items to return |
| offset | integer (≥ 0) | 0 | Items to skip (for pagination) |

Response 200 OK

{
  "items": [
    {
      "scan_id": "3fa85f64-...",
      "filename": "requirements.txt",
      "file_type": "requirements",
      "total_dependencies": 24,
      "vulnerable_count": 6,
      "created_at": "2025-03-21T10:30:00Z"
    }
  ],
  "total": 42
}

GET /api/v1/history/{scan_id}

Return the full vulnerability report for a historical scan.

Path parameter

| Parameter | Type | Description |
| --- | --- | --- |
| scan_id | UUID | The scan ID returned by POST /scan/ or listed in GET /history/ |

Response 200 OK — same shape as POST /api/v1/scan/

Error responses

| Status | Condition |
| --- | --- |
| 404 Not Found | No scan with this ID exists |

GET /health

Liveness probe for load balancers and Docker health checks.

Response 200 OK

{ "status": "ok", "version": "1.0.0" }

Running Tests

The full test suite runs completely offline — SQLite in-memory replaces Postgres and all external HTTP calls (OSV.dev, Anthropic, Qwen) are intercepted by mocks. No API keys are required to run tests.

Setup

cd backend

# If not already done:
pip install -r requirements.txt
pip install aiosqlite           # async SQLite driver for in-memory test DB

Running the suite

# Run everything
pytest

# Verbose output with test names
pytest -v

# Run a single test module
pytest tests/test_parsers.py -v
pytest tests/test_osv_client.py -v
pytest tests/test_ai_enrichment.py -v
pytest tests/test_routes.py -v

# Run tests matching a keyword
pytest -k "qwen" -v
pytest -k "history" -v

# Stop on first failure
pytest -x

# Show slowest 10 tests
pytest --durations=10

Test modules

tests/test_parsers.py — 13 tests

Pure unit tests; no I/O. Cover both parsers exhaustively.

| Test | What it verifies |
| --- | --- |
| test_single_pinned_dependency | Basic == pin is extracted |
| test_multiple_pinned_dependencies | Multiple pins in correct order |
| test_comments_are_ignored | # lines don't appear in output |
| test_blank_lines_are_ignored | Empty lines don't crash |
| test_unpinned_dependency_is_skipped | requests alone goes to skipped_lines |
| test_range_constraint_is_skipped | >= constraint goes to skipped_lines |
| test_extras_are_stripped_from_name | uvicorn[standard] → name uvicorn |
| test_environment_markers_are_ignored | ; python_version>=... stripped |
| test_vcs_url_is_skipped | git+https://... goes to skipped_lines |
| test_flag_line_is_skipped | -r other.txt goes to skipped_lines |
| test_empty_file_returns_empty_result | No crash on empty input |
| test_pre_release_version | 1.0.0b2 is a valid pinned version |
| test_mixed_content | All types together, correct counts |
| test_exact_version_is_extracted | (package.json) "4.18.2" → scanned |
| test_caret_range_is_skipped | "^4.17.21" → skipped |
| test_dev_dependencies_are_included | devDependencies entries scanned |
| test_both_dep_groups_merged | dependencies + devDependencies combined |
| test_duplicate_package_deduped | Package in both groups appears once |
| test_equals_prefix_stripped | "=7.5.4" → version 7.5.4 |
| test_invalid_json_sets_parse_error | Malformed JSON sets parse_error |
| test_prerelease_semver_accepted | "14.0.0-canary.1" is valid |

tests/test_osv_client.py — 7 tests + 3 unit tests

test_query_batch_returns_pairs          Happy path, two deps one vuln each
test_query_batch_empty_input_returns_empty  No HTTP call made
test_query_batch_no_vulns               Empty vuln list for a safe package
test_query_batch_order_preserved        Output order matches input order exactly
test_osv_http_error_raises              HTTP 500 propagates as HTTPStatusError
TestExtractCvssScore::test_plain_float_score        "8.1" → 8.1
TestExtractCvssScore::test_no_severity_returns_none [] → None
TestExtractCvssScore::test_qualitative_high_maps_to_score  "HIGH" → 7.5
TestExtractCvssScore::test_trailing_number_in_vector       CVSS vector/9.8 → 9.8

Uses pytest-httpx to intercept all httpx calls at the transport layer.

tests/test_ai_enrichment.py — 24 tests

Structured into four classes:

TestSharedEnrichmentLogic — tests the parsing pipeline once, provider-agnostically:

test_clean_json_array_is_parsed
test_json_embedded_in_prose_is_extracted
test_malformed_response_returns_stubs
test_empty_osv_pairs_returns_empty_no_api_call
test_empty_string_response_returns_stubs
test_urgency_derived_when_missing_from_response
test_multiple_packages_all_returned
test_api_exception_falls_back_to_stubs

TestAnthropicProvider:

test_anthropic_happy_path
test_anthropic_no_key_returns_empty_string
test_anthropic_uses_configured_model

TestQwenProvider:

test_qwen_happy_path
test_qwen_no_key_returns_empty_string
test_qwen_uses_configured_model
test_qwen_system_prompt_passed_correctly
test_qwen_json_object_wrapper_handled
test_qwen_no_choices_returns_stubs

TestGetEnrichmentServiceFactory:

test_factory_returns_anthropic_service_by_default
test_factory_returns_qwen_service_when_configured
test_factory_is_cached
test_reset_clears_cache

tests/test_routes.py — 12 tests

Full HTTP round-trip tests using httpx.AsyncClient with ASGI transport. External calls (OSV, AI) are mocked at the service layer.

TestScanEndpoint::
  test_scan_requirements_returns_201
  test_scan_package_json_returns_201
  test_scan_unsupported_file_type_returns_422
  test_scan_invalid_json_returns_422
  test_scan_no_pinned_deps_returns_422
  test_scan_response_has_summary_fields
  test_scan_osv_failure_returns_502
  test_scan_persists_to_db               ← verifies actual DB row creation

TestHistoryEndpoint::
  test_history_returns_200
  test_history_pagination
  test_history_detail_not_found
  test_history_detail_returns_scan

TestHealthEndpoint::
  test_health_returns_ok

Test design rationale

Why SQLite instead of Postgres for tests? Running Postgres in CI requires a service container, adds ~30 seconds of startup overhead, and couples the test suite to infrastructure. SQLite (in-memory via aiosqlite) is structurally identical for every query DepWatch runs. Postgres-specific behaviour — UUIDs, timezone-aware timestamps, cascade deletes — is validated by the Alembic migration file and integration tests.

Why mock at the service layer, not the HTTP layer, for route tests? Mocking get_osv_client() and get_enrichment_service() return values is faster, clearer, and more stable than intercepting HTTP calls from inside a full request cycle.

Why reset_enrichment_service() in factory tests? get_enrichment_service() caches a singleton. Tests that need to control which provider is returned must clear the cache in setup_method and teardown_method to avoid cross-test pollution.


CI / GitHub Actions

The workflow at .github/workflows/ci.yml runs on every push to main or develop and on every pull request targeting main.

Jobs

test — runs on ubuntu-latest, Python 3.12:

  1. Check out repository
  2. Set up Python with pip cache keyed on requirements.txt
  3. pip install -r requirements.txt && pip install aiosqlite
  4. pytest tests/ -v --tb=short

No Postgres service container is required. DATABASE_URL is set to a dummy Postgres URL (never actually connected) and the test session uses the SQLite override from conftest.py.

lint — runs in parallel with test:

  1. pip install ruff==0.9.1
  2. ruff check app/ tests/
  3. ruff format --check app/ tests/

Badge

Add this to your fork's README:

[![CI](https://github.com/yourname/depwatch/actions/workflows/ci.yml/badge.svg)](https://github.com/yourname/depwatch/actions/workflows/ci.yml)

Database & Migrations

Schema

scans

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID PK | uuid4() default |
| filename | VARCHAR(255) | Original upload filename |
| file_type | VARCHAR(50) | "requirements" or "package_json" |
| total_dependencies | INTEGER | Count of pinned deps parsed |
| vulnerable_count | INTEGER | Unique packages with ≥1 CVE |
| created_at | TIMESTAMPTZ | Server-side now() default |

vulnerabilities

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID PK | uuid4() default |
| scan_id | UUID | FK → scans.id ON DELETE CASCADE |
| package_name | VARCHAR(255) | |
| package_version | VARCHAR(100) | |
| cve_id | VARCHAR(100) | CVE-YYYY-NNNNN or GHSA ID |
| cvss_score | FLOAT | Nullable — not all OSV entries include a score |
| severity | VARCHAR(50) | Critical / High / Medium / Low / Unknown |
| plain_english_explanation | TEXT | AI-generated. Nullable |
| exploit_scenario | TEXT | AI-generated. Nullable |
| remediation | TEXT | AI-generated. Nullable |
| urgency | VARCHAR(50) | Immediate / Soon / Low Priority. Nullable |

Indexes: ix_scans_created_at on scans.created_at; ix_vulnerabilities_scan_id on vulnerabilities.scan_id.

Running migrations

cd backend

# Apply all pending migrations
alembic upgrade head

# Check current revision
alembic current

# Generate a new migration from ORM model changes
alembic revision --autogenerate -m "add remediation_url column"

# Roll back one revision
alembic downgrade -1

Automatic table creation

On application startup, app/main.py calls Base.metadata.create_all() via the lifespan handler. This is idempotent — it creates tables that don't exist and does nothing for tables that do. It is not a replacement for Alembic migrations (which handle column additions, renames, and index changes), but it ensures the app works on first run from Docker Compose without a manual migration step.

For production deployments, run alembic upgrade head as an init container or as part of your deployment pipeline before starting the application.


Design Decisions

Single OSV batch call per scan

A single POST /v1/querybatch replaces N individual POST /v1/query calls. For a 30-dependency requirements.txt this cuts OSV round-trips from 30 to 1, reducing scan latency by ~95% and easing load on OSV's free infrastructure. The client auto-chunks at 100 items to stay safely below OSV's documented 1,000-item limit.
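The chunking amounts to slicing the dependency list and building one request body per slice. A sketch (the payload shape follows OSV's documented /v1/querybatch format; the function name is illustrative):

```python
def build_querybatch_payloads(deps, chunk_size=100):
    """Split (name, version, ecosystem) tuples into OSV /v1/querybatch bodies.

    OSV caps a batch at 1,000 queries; chunking at 100 stays well below that.
    """
    payloads = []
    for i in range(0, len(deps), chunk_size):
        payloads.append({
            "queries": [
                {"package": {"name": name, "ecosystem": ecosystem}, "version": version}
                for name, version, ecosystem in deps[i:i + chunk_size]
            ]
        })
    return payloads


# 230 dependencies -> 3 requests of 100, 100, and 30 queries
deps = [(f"pkg{n}", "1.0.0", "PyPI") for n in range(230)]
payloads = build_querybatch_payloads(deps)
```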

Single AI call per scan (sub-batched at 50)

All CVEs from a scan are sent to the AI provider in one call rather than one call per CVE. For a file with 8 vulnerabilities, this is 8× cheaper and faster. Claude's 200k and Qwen's 131k context windows comfortably accommodate even large scans. At 50+ CVEs the client makes sequential sub-batch calls to stay within typical output token limits.
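The sub-batching itself is just a slice loop — one provider call per slice of at most 50 CVEs. An illustrative sketch, where `call_provider` stands in for the real AI client call:

```python
async def enrich_in_batches(cves, call_provider, batch_size=50):
    """Send CVEs to the AI provider in sequential sub-batches of <= batch_size.

    call_provider is an async callable taking a list of CVEs and returning
    a list of enriched records in the same order.
    """
    enriched = []
    for i in range(0, len(cves), batch_size):
        enriched.extend(await call_provider(cves[i:i + batch_size]))
    return enriched
```

A scan with 49 CVEs stays a single call; 120 CVEs become three sequential calls of 50, 50, and 20.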

Provider abstraction via inheritance

BaseEnrichmentService owns all JSON parsing, validation, and stub logic. AnthropicEnrichmentService and QwenEnrichmentService implement only _call_api(user_message) -> str. Adding a third provider (e.g. Gemini, Mistral, Ollama) requires only a new subclass and a one-line addition to the factory function — no changes to the shared parsing pipeline.
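A sketch of that shape — class and function names follow the README's description, but the method bodies here are illustrative placeholders, not the project's real parsing logic:

```python
from abc import ABC, abstractmethod


class BaseEnrichmentService(ABC):
    """Owns message building, parsing, and stubs; subclasses supply the API call."""

    @abstractmethod
    async def _call_api(self, user_message: str) -> str: ...

    async def enrich(self, cves):
        raw = await self._call_api(self._build_message(cves))
        return self._parse_response(raw)

    def _build_message(self, cves):          # placeholder for prompt assembly
        return "\n".join(cves)

    def _parse_response(self, raw):          # placeholder for JSON parsing + stubs
        return raw.splitlines()


class AnthropicEnrichmentService(BaseEnrichmentService):
    async def _call_api(self, user_message: str) -> str:
        raise NotImplementedError("would call the Anthropic Messages API")


def get_enrichment_service(provider: str) -> BaseEnrichmentService:
    services = {"anthropic": AnthropicEnrichmentService}
    return services[provider]()  # adding a provider = one new dict entry
```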

Separate ORM and Pydantic schemas

SQLAlchemy models define the database schema; Pydantic schemas define the HTTP contract. A DB column rename doesn't break the API response shape. An API response field addition doesn't require a migration. The two layers can evolve at different paces.

UUID primary keys

Sequential integer IDs expose row counts (/history/5 implies there are at least 5 scans) and make enumeration attacks trivial. UUIDs are safe to expose in URLs, API responses, and logs.

expire_on_commit=False on the async session

FastAPI serialises the response object immediately after the route handler returns, inside the same get_db session scope. With SQLAlchemy's default (expire_on_commit=True), attributes are expired after commit(). Accessing them would trigger implicit lazy loads on an async session, causing MissingGreenlet errors. expire_on_commit=False keeps all loaded attributes available for the lifetime of the request.
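The synchronous analogue makes the failure mode visible: with the default, attributes expired by `commit()` cannot be loaded once the session is gone; with `expire_on_commit=False` they stay available. A sketch with a minimal stand-in model:

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Scan(Base):
    __tablename__ = "scans"
    id = Column(Integer, primary_key=True)
    filename = Column(String(255))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

Session = sessionmaker(bind=engine, expire_on_commit=False)
with Session() as session:
    scan = Scan(filename="requirements.txt")
    session.add(scan)
    session.commit()  # attributes are NOT expired by this commit

# The session is closed, yet attribute access still works. With the default
# expire_on_commit=True this raises DetachedInstanceError here (and the
# async equivalent surfaces as MissingGreenlet).
print(scan.filename)
```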

SQLite for tests

Postgres-in-CI requires a service container, extends pipeline time, and tightly couples test infrastructure to the database engine. SQLite (in-memory, via aiosqlite) is structurally identical for every query DepWatch executes. Postgres-specific behaviour is verified by the Alembic migration and the Docker Compose integration environment.

get_db commits on success, rolls back on exception

The get_db dependency commits inside try and rolls back inside except. Route handlers call await db.flush() (not commit()) to write rows within the transaction, letting the dependency control the transaction boundary. This prevents partially-written scans if the serialisation step raises after the insert.
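The dependency follows the standard commit-on-success, rollback-on-error generator pattern. A sketch (the factory indirection is added here only so the pattern is testable without a database; the real `get_db` binds the project's async session factory directly):

```python
def make_get_db(session_factory):
    """Build a FastAPI dependency that owns the transaction boundary."""

    async def get_db():
        async with session_factory() as session:
            try:
                yield session          # route handler runs here; it only flush()es
                await session.commit() # commit once the handler (and response) succeed
            except Exception:
                await session.rollback()
                raise
    return get_db
```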

Prompts as constants

VULNERABILITY_ENRICHMENT_SYSTEM_PROMPT is a module-level string constant in services/prompts.py, not an f-string or a database record. This makes it reviewable in code review, diffable in git, and importable in tests. The template is provider-agnostic — it instructs the model in terms of input/output JSON schemas, not in provider-specific syntax.


Troubleshooting

The Reflex frontend takes a very long time to start

The first reflex run (or docker compose up --build) compiles the Vite bundle, which downloads npm packages and runs the build. This takes 60–120 seconds on a cold start. Subsequent starts use the cached .web/ directory (or the reflex_web Docker volume) and complete in ~5 seconds.

If it seems stuck, check the container logs:

docker compose logs -f frontend

Look for the line App running at: http://localhost:3000 — that's when compilation is complete.

ANTHROPIC_API_KEY is not set warning at startup

This is expected behaviour when the key is missing. The application will start and scans will succeed, but vulnerability reports will use stubs rather than AI-generated enrichment. To enable full enrichment, add your key to .env and restart the backend.

AI enrichment unavailable in scan results

This appears in the plain_english_explanation field when the AI call fails or no key is configured. The scan result is still valid — OSV data (CVE IDs, CVSS scores) is present and correct. Check:

  1. Is ANTHROPIC_API_KEY or QWEN_API_KEY set in .env?
  2. Does AI_PROVIDER match the key you've provided?
  3. Check backend logs: docker compose logs backend | grep "API error"

No pinned dependencies found error (422)

DepWatch only scans exact versions. If your file contains only range constraints:

# requirements.txt with ranges — will return 422
flask>=3.0.0
requests~=2.28

Generate a pinned file:

pip freeze > requirements-pinned.txt

Or for npm:

npm shrinkwrap      # generates npm-shrinkwrap.json (rename to package.json)
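For requirements.txt, pinned-version detection amounts to accepting only `==` lines and skipping everything else — an illustrative sketch, not DepWatch's exact parser:

```python
import re

# exact-pin lines only: "name==version"; ranges (>=, ~=, <) do not match
PINNED = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*==\s*([A-Za-z0-9.+!*_-]+)\s*$")


def parse_pinned(text):
    """Return (name, version) pairs for exactly-pinned lines; skip the rest."""
    pinned = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # strip trailing comments
        match = PINNED.match(line)
        if match:
            pinned.append((match.group(1), match.group(2)))
    return pinned
```

A file containing only `flask>=3.0.0` and `requests~=2.28` would yield an empty list, which is what triggers the 422.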

OSV returns no vulnerabilities for known-vulnerable packages

OSV.dev data is updated continuously but may lag new CVE publications by hours. Also verify:

  • The package name exactly matches the PyPI or npm registry name (case-insensitive for PyPI, exact-case for npm)
  • The version is genuinely affected — a patched version may correctly show no vulnerabilities

Database connection errors on startup

The backend waits for Postgres to pass its health check before starting (via the depends_on: condition: service_healthy setting in Compose). If you see connection errors, check:

docker compose ps
# Verify the `db` service shows "(healthy)"

docker compose logs db
# Look for "database system is ready to accept connections"

Port conflicts

Default ports: 3000 (frontend), 3001 (Reflex internal), 8000 (backend), 5432 (Postgres). To change them, edit docker-compose.yml and frontend/rxconfig.py.


Contributing

Contributions are welcome. Please follow this workflow:

# 1. Fork and clone
git clone https://github.com/yourname/depwatch.git
cd depwatch

# 2. Create a feature branch
git checkout -b feat/your-feature-name

# 3. Make changes and add tests

# 4. Verify locally
cd backend
pip install -r requirements.txt aiosqlite
pytest -v                        # all tests must pass
ruff check app/ tests/           # no lint errors
ruff format app/ tests/          # code must be formatted

# 5. Push and open a pull request
git push origin feat/your-feature-name

CI will run automatically on your PR. Merges to main require passing tests and lint.

Adding a new AI provider

  1. Create a subclass of BaseEnrichmentService in app/services/ai_enrichment.py:
    class GeminiEnrichmentService(BaseEnrichmentService):
        async def _call_api(self, user_message: str) -> str:
            # call Gemini API, return raw text response
            ...
  2. Add a new Literal option to ai_provider in app/config.py
  3. Add any provider-specific settings fields to Settings
  4. Register the new class in get_enrichment_service()
  5. Add tests in tests/test_ai_enrichment.py following the TestQwenProvider pattern
  6. Update .env.example and this README

Licence

MIT — see LICENSE for full text.


Built with FastAPI · Reflex · OSV.dev · Claude · Qwen
